
    Thermodynamics of deterministic finite automata operating locally and periodically

    Get PDF
    Real-world computers have operational constraints that cause nonzero entropy production (EP). In particular, almost all real-world computers are 'periodic', iteratively undergoing the same physical process; and 'local', in that subsystems evolve whilst physically decoupled from the rest of the computer. These constraints are so universal because decomposing a complex computation into small, iterative calculations is what makes computers so powerful. We first derive the nonzero EP caused by the locality and periodicity constraints for deterministic finite automata (DFA), a foundational system of computer science theory. We then relate this minimal EP to the computational characteristics of the DFA. We thus divide the languages recognised by DFA into two classes: those that can be recognised with zero EP, and those that necessarily have nonzero EP. We also demonstrate the thermodynamic advantages of implementing a DFA with a physical process that is agnostic about the inputs that it processes.
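
    As a point of reference for the object analysed above, here is a minimal sketch of a DFA recognising a simple regular language; the example language (binary strings with an even number of 1s), the state names, and the encoding are illustrative assumptions, not taken from the paper. Each loop iteration corresponds to one "periodic" step of the physical process.

```python
# Minimal DFA sketch: recognises binary strings with an even number of 1s.
# The language and state labels are illustrative; the paper analyses the
# entropy production of physical implementations of such automata.

DFA = {
    "states": {"even", "odd"},
    "start": "even",
    "accept": {"even"},
    "delta": {                      # transition function: (state, symbol) -> state
        ("even", "0"): "even",
        ("even", "1"): "odd",
        ("odd", "0"): "odd",
        ("odd", "1"): "even",
    },
}

def accepts(dfa, word):
    """Run the DFA locally and periodically: one symbol per iteration."""
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["accept"]

print(accepts(DFA, "1101"))  # False (three 1s)
print(accepts(DFA, "1100"))  # True  (two 1s)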

    TreeGrad: Transferring Tree Ensembles to Neural Networks

    Full text link
    Gradient Boosting Decision Trees (GBDT) are popular machine learning algorithms with dedicated implementations such as LightGBM and implementations in popular machine learning toolkits like Scikit-Learn. Many implementations can only produce trees offline and in a greedy manner. We explore ways to convert existing GBDT implementations to known neural network architectures with minimal performance loss, in order to allow decision splits to be updated in an online manner, and provide extensions that allow split points to be altered as a neural architecture search problem. We provide learning bounds for our neural network. Comment: Technical report on the implementation of the Deep Neural Decision Forests algorithm, to accompany the implementation at https://github.com/chappers/TreeGrad. Update: please cite as: Siu, C. (2019). "Transferring Tree Ensembles to Neural Networks". International Conference on Neural Information Processing. Springer, 2019. arXiv admin note: text overlap with arXiv:1909.1179
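
    The abstract does not spell out the construction, but a common way to make decision splits updatable online is to relax each hard axis-aligned split into a sigmoid gate over a linear projection so that the routing (and hence the split point) becomes differentiable. The sketch below illustrates that relaxation under this assumption; it is not the paper's exact method.

```python
import numpy as np

def hard_split(x, feature, threshold):
    """Classic GBDT-style split: route right (1) if the feature exceeds the threshold."""
    return (x[:, feature] > threshold).astype(float)

def soft_split(x, weights, bias, temperature=1.0):
    """Differentiable relaxation: a logistic gate over a linear projection.
    With weights = one-hot(feature) and bias = -threshold this approximates
    the hard split, but both can now be updated by gradient descent."""
    logits = (x @ weights + bias) / temperature
    return 1.0 / (1.0 + np.exp(-logits))

x = np.array([[0.2, 3.0], [0.9, -1.0]])
w = np.array([1.0, 0.0])                              # one-hot pick of feature 0
print(hard_split(x, feature=0, threshold=0.5))        # [0. 1.]
print(soft_split(x, w, bias=-0.5, temperature=0.1))   # ~[0.05 0.98]
```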

    Automatic Induction of Neural Network Decision Tree Algorithms

    Full text link
    This work presents an approach to the automatic induction of non-greedy decision trees constructed from a neural network architecture. This construction can be used to transfer weights when growing or pruning a decision tree, allowing non-greedy decision tree algorithms to automatically learn and adapt to the ideal architecture. In this work, we examine the underpinning ideas within ensemble modelling and Bayesian model averaging which allow our neural network to asymptotically approach the ideal architecture through weight transfer. Experimental results demonstrate that this approach improves over models with a fixed set of hyperparameters for both decision tree and decision forest models. Comment: This is a pre-print of the contribution "Chapman Siu, Automatic Induction of Neural Network Decision Tree Algorithms", to appear in Computing Conference 2019 Proceedings, Advances in Intelligent Systems and Computing. Implementation: https://github.com/chappers/automatic-induction-neural-decision-tre
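
    As an illustration of the weight-transfer idea when growing a tree, a leaf can be expanded into a split node whose gate is initialised near-flat and whose children copy the parent's value, so the grown architecture initially computes (almost) the same function and can then adapt. The initialisation scheme below is an assumption made for the sketch, not the paper's exact procedure.

```python
import numpy as np

def grow_leaf(leaf_value, n_features, epsilon=1e-3):
    """Illustrative weight transfer when growing a leaf into a split node.

    The new gate has tiny random weights (near-uniform routing) and both
    child leaves inherit the parent's value, so the grown subtree initially
    predicts what the old leaf predicted and can then be refined by gradient
    descent."""
    gate_w = np.random.randn(n_features) * epsilon
    gate_b = 0.0
    left_value = np.copy(leaf_value)
    right_value = np.copy(leaf_value)
    return gate_w, gate_b, left_value, right_value

w, b, left, right = grow_leaf(np.array([0.7]), n_features=2)
# Prediction of the new subtree for any input x:
#   sigmoid(x @ w + b) * right + (1 - sigmoid(x @ w + b)) * left  ~= 0.7
```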

    An Exact No Free Lunch Theorem for Community Detection

    Full text link
    A precondition for a No Free Lunch theorem is evaluation with a loss function which does not assume a priori superiority of some outputs over others. A previous result for community detection by Peel et al. (2017) relies on a mismatch between the loss function and the problem domain: the loss function computes an expectation over only a subset of the universe of possible outputs, and is thus only asymptotically appropriate with respect to the problem size. By using the correct random model for the problem domain, we provide a stronger, exact No Free Lunch theorem for community detection. The claim generalizes to other set-partitioning tasks including core/periphery separation, k-clustering, and graph partitioning. Finally, we review the literature of proposed evaluation functions and identify functions which (perhaps with slight modifications) are compatible with an exact No Free Lunch theorem.
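
    A toy illustration of the exact No Free Lunch flavour of this result (a deliberately simplified sketch, not the paper's construction): averaged over the full universe of set partitions as possible ground truths, a loss that does not privilege any output gives every candidate partition the same expected loss.

```python
def partitions(items):
    """Enumerate all set partitions of a list (Bell(n) of them)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        # put `first` into each existing block, or into a new block of its own
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield [[first]] + p

def zero_one_loss(guess, truth):
    """1 if the partitions differ, 0 if identical (block order ignored)."""
    canon = lambda p: frozenset(frozenset(block) for block in p)
    return 0.0 if canon(guess) == canon(truth) else 1.0

items = [0, 1, 2, 3]
all_parts = list(partitions(items))          # Bell(4) = 15 partitions
for guess in all_parts[:3]:
    avg = sum(zero_one_loss(guess, t) for t in all_parts) / len(all_parts)
    print(guess, avg)                        # every guess averages 14/15
```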

    Predicting Fluid Intelligence of Children using T1-weighted MR Images and a StackNet

    Full text link
    In this work, we utilize T1-weighted MR images and a StackNet to predict fluid intelligence in adolescents. Our framework includes feature extraction, feature normalization, feature denoising, feature selection, training a StackNet, and predicting fluid intelligence. The extracted features are the distributions of different brain tissues across brain parcellation regions. The proposed StackNet consists of three layers and 11 models, where each layer uses the predictions from all previous layers including the input layer. The proposed StackNet is tested on the public Adolescent Brain Cognitive Development Neurocognitive Prediction Challenge 2019 benchmark and achieves a mean squared error of 82.42 on the combined training and validation set with 10-fold cross-validation. In addition, the proposed StackNet also achieves a mean squared error of 94.25 on the testing data. The source code is available on GitHub. Comment: 8 pages, 2 figures, 3 tables; accepted by the MICCAI ABCD-NP Challenge 2019; Added ND
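
    Below is a minimal two-layer sketch of the stacking pattern described above. The data, base models, and meta-model are placeholders (the actual StackNet uses three layers and 11 models on brain-tissue features); the part being illustrated is the "restacking" step in which each layer sees all previous layers' out-of-fold predictions together with the input features.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

# Toy data standing in for the extracted brain-tissue features (illustrative only).
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

def stack_layer(models, X_in, y):
    """One StackNet-style layer: out-of-fold predictions of each base model,
    concatenated to the layer's input, feed the next layer."""
    preds = [cross_val_predict(m, X_in, y, cv=5).reshape(-1, 1) for m in models]
    return np.hstack([X_in] + preds)

layer1 = [Ridge(alpha=1.0), RandomForestRegressor(n_estimators=50, random_state=0)]
X1 = stack_layer(layer1, X, y)        # input features + layer-1 predictions
meta = Ridge(alpha=1.0).fit(X1, y)    # final layer trained on the stacked features
print("in-sample MSE:", np.mean((meta.predict(X1) - y) ** 2))
```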

    DeltaPhish: Detecting Phishing Webpages in Compromised Websites

    Full text link
    The large-scale deployment of modern phishing attacks relies on the automatic exploitation of vulnerable websites in the wild, to maximize profit while hindering attack traceability, detection and blacklisting. To the best of our knowledge, this is the first work that specifically leverages this adversarial behavior for detection purposes. We show that phishing webpages can be accurately detected by highlighting HTML code and visual differences with respect to other (legitimate) pages hosted within a compromised website. Our system, named DeltaPhish, can be installed as part of a web application firewall to detect the presence of anomalous content on a website after compromise, and eventually prevent access to it. DeltaPhish is also robust against adversarial attempts in which the HTML code of the phishing page is carefully manipulated to evade detection. We empirically evaluate it on more than 5,500 webpages collected in the wild from compromised websites, showing that it is capable of detecting more than 99% of phishing webpages while misclassifying less than 1% of legitimate pages. We further show that the detection rate remains higher than 70% even under very sophisticated attacks carefully designed to evade our system. Comment: Preprint version of the work accepted at ESORICS 201
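
    DeltaPhish's actual features are richer (HTML and visual similarity against the legitimate pages of the compromised site), but the following crude HTML-difference signal, offered as an illustrative assumption rather than the system's feature set, conveys the underlying idea of flagging pages that deviate structurally from the host website.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Collect the tag names of a page as a crude structural fingerprint."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def tag_jaccard(html_a, html_b):
    """Jaccard similarity of the HTML tag sets of two pages; a low score flags
    a page whose structure differs strongly from the site's own pages."""
    sets = []
    for html in (html_a, html_b):
        parser = TagCollector()
        parser.feed(html)
        sets.append(set(parser.tags))
    inter = len(sets[0] & sets[1])
    union = len(sets[0] | sets[1]) or 1
    return inter / union

homepage = "<html><body><h1>Shop</h1><p>Welcome</p></body></html>"
candidate = "<html><body><form><input><input></form></body></html>"
print(tag_jaccard(homepage, candidate))  # low similarity -> suspicious page
```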

    Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles

    Get PDF
    We examine a network of learners which address the same classification task but must learn from different data sets. The learners cannot share data but instead share their models; models are shared only once so as to limit the network load. We introduce DELCO (standing for Decentralized Ensemble Learning with COpulas), a new approach for aggregating the predictions of the classifiers trained by each learner. The proposed method aggregates the base classifiers using a probabilistic model relying on Gaussian copulas. Experiments on logistic regression ensembles demonstrate competitive accuracy and increased robustness when the classifiers are dependent. A companion Python implementation can be downloaded at https://github.com/john-klein/DELC
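
    The full DELCO aggregation rule is more involved, but its key ingredient is a Gaussian copula modelling dependence between the base classifiers' outputs. The sketch below (an illustrative assumption, not the DELCO code) evaluates that copula density, which reduces to 1 under the independence assumption.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_density(u, R):
    """Density of a Gaussian copula with correlation matrix R, evaluated at
    u in (0, 1)^d. With R = identity this is 1, i.e. independent classifiers."""
    z = norm.ppf(np.clip(u, 1e-9, 1 - 1e-9))            # probit transform
    R_inv = np.linalg.inv(R)
    quad = z @ (R_inv - np.eye(len(u))) @ z
    return np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(R))

# Two dependent base classifiers both reporting P(class = 1) = 0.8:
u = np.array([0.8, 0.8])
R = np.array([[1.0, 0.6], [0.6, 1.0]])
print(gaussian_copula_density(u, R))           # > 1: their agreement is expected
print(gaussian_copula_density(u, np.eye(2)))   # 1.0 under independence
```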

    Evidential Bagging: Combining Heterogeneous Classifiers in the Belief Functions Framework

    Get PDF
    In machine learning, ensemble learning methodologies are known to improve predictive accuracy and robustness. They consist in learning many classifiers whose outputs are finally combined according to different techniques. Bagging, or Bootstrap Aggregating, is one of the most famous ensemble methodologies and is usually applied with the same base classification algorithm, i.e. the same type of classifier is learnt multiple times on bootstrapped versions of the initial learning dataset. In this paper, we propose a bagging methodology that involves different types of classifiers. The classifiers' probabilistic outputs are used to build mass functions which are further combined within the belief functions framework. Three different ways of building mass functions are proposed, and preliminary experiments on benchmark datasets show the relevance of the approach.
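
    As an illustration of the belief-functions machinery mentioned above, two classifiers' mass functions on a binary frame can be combined with Dempster's rule. The discounting construction below is one simple way to build a mass function from a probabilistic output, offered as an assumption rather than one of the paper's three proposals.

```python
def discount(p_a, alpha):
    """Mass function on the frame {A, B} built from a probabilistic output
    p_a = P(A), discounted by a reliability factor alpha in [0, 1]; the
    remaining mass 1 - alpha goes to the whole frame 'A|B' (ignorance)."""
    return {"A": alpha * p_a, "B": alpha * (1 - p_a), "A|B": 1 - alpha}

def dempster(m1, m2):
    """Dempster's rule of combination on the binary frame {A, B}."""
    def mass(m, x):
        return m.get(x, 0.0)
    # conflict: one source supports A while the other supports B
    k = mass(m1, "A") * mass(m2, "B") + mass(m1, "B") * mass(m2, "A")
    combined = {}
    for focal in ("A", "B"):
        combined[focal] = (
            mass(m1, focal) * mass(m2, focal)
            + mass(m1, focal) * mass(m2, "A|B")
            + mass(m1, "A|B") * mass(m2, focal)
        ) / (1 - k)
    combined["A|B"] = mass(m1, "A|B") * mass(m2, "A|B") / (1 - k)
    return combined

m1 = discount(0.9, alpha=0.8)   # confident, fairly reliable classifier
m2 = discount(0.6, alpha=0.5)   # hesitant, less reliable classifier
print(dempster(m1, m2))         # combined masses sum to 1
```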

    Program trace optimization

    Get PDF
    This is the author accepted manuscript; the final version is available from Springer via the DOI in this record. Paper to be presented at the Fifteenth International Conference on Parallel Problem Solving from Nature (PPSN XV), Coimbra, Portugal, 8-12 September 2018. We introduce Program Trace Optimization (PTO), a system for 'universal heuristic optimization made easy'. This is achieved by strictly separating the problem from the search algorithm. New problem definitions and new generic search algorithms can be added to PTO easily and independently, and any algorithm can be used on any problem. PTO automatically extracts knowledge from the problem specification and designs search operators for the problem. The operators designed by PTO for standard representations coincide with existing ones, but PTO automatically designs operators for arbitrary representations.
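
    The following toy reconstruction of the trace idea (an assumption about the mechanism, not the PTO code itself) shows how a problem can be specified as a randomised generator whose recorded random choices form the genotype, so that a generic operator such as mutation acts on the trace and any search algorithm can simply replay it.

```python
import random

def generator(trace, budget=10):
    """Problem specification as a randomised program: build a bitstring.
    Random choices are read from (and recorded in) `trace`, so the trace is
    the genotype and replaying it reproduces the same solution."""
    solution = []
    for i in range(budget):
        if i not in trace:
            trace[i] = random.randint(0, 1)
        solution.append(trace[i])
    return solution

def fitness(solution):
    return sum(solution)            # toy objective: OneMax

def mutate(trace):
    """Generic search operator acting on the trace, not on the solution."""
    child = dict(trace)
    key = random.choice(list(child))
    child[key] = random.randint(0, 1)
    return child

trace = {}
best = generator(trace)
for _ in range(100):                # a simple hill-climber over traces
    cand_trace = mutate(trace)
    cand = generator(cand_trace)
    if fitness(cand) >= fitness(best):
        trace, best = cand_trace, cand
print(best, fitness(best))
```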

    Improving the minimum description length inference of phrase-based translation models

    Get PDF
    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-19390-8_25. We study the application of minimum description length (MDL) inference to estimate pattern recognition models for machine translation. MDL is a theoretically-sound approach whose empirical results are, however, below those of the state-of-the-art pipeline of training heuristics. We identify potential limitations of current MDL procedures and provide a practical approach to overcome them. Empirical results support the soundness of the proposed approach. Work supported by the EU 7th Framework Programme (FP7/2007–2013) under the CasMaCat project (grant agreement no. 287576), by Spanish MICINN under grant TIN2012-31723, and by the Generalitat Valenciana under grant ALMPR (Prometeo/2009/014). Citation: Gonzalez Rubio, J.; Casacuberta Nolla, F. (2015). Improving the minimum description length inference of phrase-based translation models. In Pattern Recognition and Image Analysis: 7th Iberian Conference, IbPRIA 2015, Santiago de Compostela, Spain, June 17-19, 2015, Proceedings. Springer International Publishing, 219-227. https://doi.org/10.1007/978-3-319-19390-8_25
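
    Below is a minimal sketch of the two-part MDL score underlying this kind of inference; the coding scheme, the toy phrase table, and the derivation are illustrative assumptions, not the paper's actual model. MDL selects the phrase table that minimises the bits needed to encode the table plus the bits needed to encode the training corpus given the table.

```python
import math

def model_bits(phrase_table):
    """Crude model cost: 8 bits per character of every phrase pair."""
    return 8 * sum(len(src) + len(tgt) for (src, tgt) in phrase_table)

def data_bits(phrase_table, derivation):
    """Data cost: -log2 probability of the phrase pairs used to explain
    the training corpus under the current table."""
    return sum(-math.log2(phrase_table[pair]) for pair in derivation)

# Toy phrase table with translation probabilities (illustrative only).
table = {("la casa", "the house"): 0.5, ("la", "the"): 0.3, ("casa", "house"): 0.2}
derivation = [("la casa", "the house"), ("la", "the"), ("casa", "house")]

mdl = model_bits(table) + data_bits(table, derivation)
print(round(mdl, 2))  # MDL inference keeps the table that minimises this total
```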